It is notoriously difficult to predict the behaviour of a complex self-organizing 
system, where the interactions among dynamical units form a heterogeneous 
topology. Even if the dynamics of each microscopic unit is known, a real 
understanding of their contributions to the macroscopic system behaviour is 
still lacking. Here, we develop information-theoretical methods to distinguish 
the contribution of each individual unit to the collective out-of-equilibrium 
dynamics. We show that for a system of units connected by a network of inter- 
action potentials with an arbitrary degree distribution, highly connected units 
have less impact on the system dynamics when compared with intermediately 
connected units. In an equilibrium setting, the hubs are often found to dictate 
the long-term behaviour. However, we find both analytically and experimen- 
tally that the instantaneous states of these units have a short-lasting effect on 
the state trajectory of the entire system. We present qualitative evidence of this 
phenomenon from empirical findings about a social network of product recom- 
mendations, a protein-protein interaction network and a neural network, 
suggesting that it might indeed be a widespread property in nature. 



1. Introduction 

Many non-equilibrium systems consist of dynamical units that interact through a 
network to produce complex behaviour as a whole. In a wide variety of such 
systems, each unit has a state that quasi-equilibrates to the distribution of states 
of the units it interacts with, or 'interaction potential', which results in the new 
state of the unit. This assumption is also known as the local thermodynamic equi- 
librium (LTE), originally formulated to describe radiative transfer inside stars [1,2]. 
Examples of systems of coupled units that have been described in this manner 
include brain networks [3-6], cellular regulatory networks [7-11], immune net- 
works [12,13], social interaction networks [14-20] and financial trading markets 
[15,21,22]. A state change of one unit may subsequently cause a neighbour unit 
to change its state, which may, in turn, cause other units to change, and so on. 
The core problem of understanding the system's behaviour is that the topology 
of interactions mixes cause and effect of units in a complex manner, making it 
hard to tell which units drive the system dynamics. 

The main goal of complex systems research is to understand how the 
dynamics of individual units combine to produce the behaviour of the 
system as a whole. A common method to dissect the collective behaviour 
into its individual components is to remove a unit and observe the effect 
[23-32]. In this manner, it has been shown, for instance, that highly connected 
units or hubs are crucial for the structural integrity of many real-world systems 
[28], i.e. removing only a few hubs disconnects the system into subnetworks 
which can no longer interact. On the other hand, Tanaka et al. [32] find that 
sparsely connected units are crucial for the dynamical integrity of systems 
where the remaining (active) units must compensate for the removed (failed) 
units. Less attention has been paid to studying the interplay of unit dynamics 
and network topology, from which the system's behaviour 
emerges, in a non-perturbative and unified manner. 

We introduce an information-theoretical approach to 
quantify to what extent the system's state is actually a rep- 
resentation of an instantaneous state of an individual unit. 
The minimum number of yes/no questions that is required 
to determine a unique instance of a system's state is called 
its entropy, measured in the unit bits [33]. If a system state 
$S^t$ can be in state $i$ with probability $p_i$, then its Shannon 
entropy is 

    $H(S^t) = -\sum_i p_i \log_2 p_i.$    (1.1) 

For example, to determine a unique outcome of N fair coin 
flips requires N bits of information, that is, a reduction of 
entropy by N bits. The more bits of a system's state $S^t$ are 
determined by a prior state $s_i^{t_0}$ of a unit $s_i$ at time $t_0$, the 
more the system state depends on that unit's state. This quantity 
can be measured using the mutual information between 
$s_i^{t_0}$ and $S^t$, defined as 

    $I(S^t; s_i^{t_0}) = H(S^t) - H(S^t \mid s_i^{t_0}),$    (1.2) 

where $H(X \mid Y)$ is the conditional variant of $H(X)$. As time 
passes ($t \to \infty$), $S^t$ becomes more and more independent of 
$s_i^{t_0}$ until eventually the unit's state provides zero information 
about $S^t$. This mutual information integrated over time $t$ is a 
generic measure of the extent that the system state trajectory 
is dictated by a unit. 
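
As an illustration of equations (1.1) and (1.2), the following minimal sketch computes the Shannon entropy of a discrete distribution and the mutual information of a small joint distribution. The two joint tables are hypothetical and serve only to show the two extremes: a unit state that is independent of the system summary, and a unit state that determines it completely.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution, as in equation (1.1)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) - H(X|Y) in bits, from a joint table joint[x][y]."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    h_x = entropy(p_x)
    # H(X|Y) = sum over y of p(y) * H(X | Y = y)
    h_x_given_y = 0.0
    for y, py in enumerate(p_y):
        if py > 0:
            h_x_given_y += py * entropy([row[y] / py for row in joint])
    return h_x - h_x_given_y

# N fair coin flips: 2**N equally likely outcomes carry N bits.
n_flips = 3
coin_bits = entropy([1 / 2**n_flips] * 2**n_flips)

# Hypothetical joint distributions of a binary system summary X
# and a binary unit state Y (illustrative values only).
independent = [[0.25, 0.25], [0.25, 0.25]]  # Y says nothing about X
copy = [[0.5, 0.0], [0.0, 0.5]]             # Y determines X completely
mi_independent = mutual_information(independent)
mi_copy = mutual_information(copy)
```

In the independent case the mutual information vanishes, whereas in the copy case it equals the full entropy of the unit state.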

We consider large static networks of identical units whose 
dynamics can be described by the Gibbs measure. The Gibbs 
measure describes how a unit changes its state subject to the 
combined potential of its interacting neighbours, in case the 
LTE is appropriate and using the maximum-entropy prin- 
ciple [34,35] to avoid assuming any additional structure. 
In fact, in our LTE description, each unit may even be a sub- 
system in its own right in a multi-scale setting, such as a cell 
in a tissue or a person in a social network. In this viewpoint, 
each unit can actually be in a large number of (unobservable) 
microstates which translate many-to-one to the (observable) 
macrostates of the unit. We consider that at a small timescale, 
each unit probabilistically chooses its next state depending on 
the current states of its neighbours, termed discrete-time 
Markov networks [36]. Furthermore, we consider random 
interaction networks with a given degree distribution $p(k)$, 
which denotes the probability that a randomly selected unit 
has $k$ interactions with other units, and which have a maximum 
degree $k_{\max}$ that grows less than linearly in the network 
size $N$. Self-loops are not allowed. No additional topological 
features are imposed, such as degree-degree correlations or 
community structures. An important consequence of these 
assumptions for our purpose is that the network is 'locally 
tree-like' [37,38], i.e. link cycles are exceedingly long. 
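
A random network of this kind can be sketched with a stub-matching (configuration-model) construction. The code below is an illustrative assumption and not necessarily the procedure used for the experiments reported here: the function names, the cut-off $k_{\max} = \sqrt{N}$ and the choice to erase self-loops and duplicate links after matching are all hypothetical.

```python
import random

def sample_degrees(n, gamma, k_min, k_max, rng):
    """Draw n degrees from a truncated power law p(k) proportional to k**(-gamma)."""
    ks = list(range(k_min, k_max + 1))
    weights = [k ** (-gamma) for k in ks]
    degrees = rng.choices(ks, weights=weights, k=n)
    if sum(degrees) % 2 == 1:
        # Stub matching needs an even number of stubs; this may push
        # one degree to k_max + 1, which is harmless for a sketch.
        degrees[0] += 1
    return degrees

def configuration_model(degrees, rng):
    """Pair stubs uniformly at random; erase self-loops and duplicate links."""
    stubs = [i for i, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    edges = set()
    for a, b in zip(stubs[::2], stubs[1::2]):
        if a != b:                              # self-loops are not allowed
            edges.add((min(a, b), max(a, b)))   # the set drops duplicates
    return edges

rng = random.Random(1)
n = 1000
k_max = int(n ** 0.5)  # sublinear cut-off; sqrt(N) is an illustrative choice
degrees = sample_degrees(n, gamma=1.6, k_min=1, k_max=k_max, rng=rng)
edges = configuration_model(degrees, rng)
```

Erasing self-loops and duplicate links slightly lowers the realized degrees of the largest hubs, so the realized degree distribution only approximates the target $p(k)$.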

We show analytically that for this class of systems, the 
impact of a unit's state on the short-term behaviour of 
the whole system is a decreasing function of the degree k 
of the unit for sufficiently high k. That is, it takes a relatively 
short time-period for the information about the instantaneous 
state of such a high-degree unit to be no longer present in the 
information stored by the system. A corollary of this finding 
is that if one would observe the system's state trajectory for a 
short amount of time, then the (out-of-equilibrium) behav- 
iour of the system cannot be explained by the behaviour of 
the hubs. In other words, if the task is to optimally predict 



the short-term system behaviour after observing a subset of 
the units' states, then high-degree units should not be chosen. 

We validate our analytical predictions using numerical 
experiments of random networks of 6000 ferromagnetic 
Ising spins where the number of interactions $k$ of a spin is distributed 
as a power law $p(k) \propto k^{-\gamma}$. Ising-spin dynamics are 
extensively studied and are often used as a first approxi- 
mation of the dynamics of a wide variety of complex 
physical phenomena [37]. We find further qualitative evi- 
dence in the empirical data of the dynamical importance of 
units as function of their degree in three different domains, 
namely viral marketing in social networks [39], evolutionary 
conservation of human proteins [40] and the transmission of 
a neuron's activity in neural networks [41]. 

2. Results 

2.1. Information dissipation time of a unit 

As a measure of the dynamical importance of a unit s, we cal- 
culate its information dissipation time (IDT), denoted D(s). In 
words, it is the time it takes for the information about the 
state of the unit s to disappear from the network's state. As 
another way of describing it, it is the time it takes for the net- 
work as a whole to forget a particular state of a single unit. 
Here, we derive analytically a relation between the number 
of interactions of a unit and the IDT of its state. Our 
method to calculate the IDT is a measure of cause and 
effect and not merely of correlation; see appendix for details. 

2.1.1. Terminology 

A system $S$ consists of units $s_1, s_2, \ldots$ among which some 
pairs of units, called edges, $E = (s_i, s_j), (s_k, s_l), \ldots$ interact 
with each other. Each interaction is undirected, and the 
number of interactions that involve unit $s_i$ is denoted by $k_i$, 
called its degree, which equals $k$ with probability $p(k)$, called 
the degree distribution. The set of $k_i$ units that $s_i$ interacts 
with directly is denoted by $h_i = \{x : (s_i, x) \in E\}$. The state of 
unit $s_i$ at time $t$ is denoted by $s_i^t$, and the collection 
$S^t = (s_1^t, s_2^t, \ldots, s_N^t)$ forms the state of the system. Each unit 
probabilistically chooses its next state based on the current 
state of each of its nearest-neighbours in the interaction network. 
Unit $s_i$ chooses the next state $x$ with the conditional 
probability distribution $p(s_i^{t+1} = x \mid h_i^t)$. This is also known as 
a Markov network. 

2.1.2. Unit dynamics in the local thermodynamic equilibrium 

Before we can proceed to show that D(s) is a decreasing func- 
tion of the degree k of the unit s, we must first define the class 
of unit dynamics in more detail. That is, we first specify an 
expression for the conditional probabilities $p(s_i^{t+1} = r \mid h_i^t)$. 

We focus on discrete-time Markov networks, so the 
dynamics of each unit is governed by the same set of 
conditional probabilities $p(s_i^{t+1} = r \mid h_i^t)$ with the Markov property. 
In our LTE description, a unit chooses its next state 
depending on the energy of that state, where the energy landscape 
is induced by the states of its nearest-neighbours through 
its interactions. That is, each unit can quasi-equilibrate its 
state to the states of its neighbours. The higher the energy 
of a state at a given time, the less likely the unit is to choose 
that state. Stochasticity can arise if multiple states have an 
equal energy, and additional stochasticity is introduced by 



means of the temperature of the heat bath that surrounds the 
network. 

The consequence of this LTE description that is relevant to 
our study is that the state transition probability of a unit is an 
exponential function with respect to the energy. That is, in a 
discrete-time description, unit $s_i$ chooses $s_i^{t+1} = r$ as the next state 
with a probability 

    $p(s_i^{t+1} = r \mid h_i^t) \propto \exp\left(-\frac{1}{T} \sum_{s_j \in h_i} E(r \mid s_j^t)\right),$    (2.1) 

where $T$ is the temperature of the network's heat bath and 
$\sum_j E(r \mid s_j^t)$ is the energy of state $r$ given the states of its 
interacting neighbours $s_j \in h_i$. As a result, the energy landscape 
of $r$ does not depend on individual states of specific neighbour 
units; it depends on the distribution of neighbour states. 
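
The exponential dependence of the transition probability on the energy can be sketched as follows. The pair energy $E(r \mid s_j) = -J r s_j$ of a ferromagnetic Ising interaction, the coupling $J = 1$ and the function names are assumptions made for illustration; any other pair energy can be passed in instead.

```python
import math

def ising_energy(r, s_j, coupling=1.0):
    """Pair energy E(r | s_j) of a ferromagnetic Ising interaction (assumed form)."""
    return -coupling * r * s_j

def transition_probabilities(neighbour_states, temperature, states=(-1, 1),
                             energy=ising_energy):
    """Normalized p(next state = r | neighbour states), proportional to
    exp(-sum_j E(r | s_j) / T)."""
    weights = {}
    for r in states:
        total_energy = sum(energy(r, s_j) for s_j in neighbour_states)
        weights[r] = math.exp(-total_energy / temperature)
    norm = sum(weights.values())
    return {r: w / norm for r, w in weights.items()}

p_aligned = transition_probabilities([1, 1, 1], temperature=2.0)
p_balanced = transition_probabilities([1, -1], temperature=2.0)
p_permuted = transition_probabilities([-1, 1], temperature=2.0)
```

Permuting the neighbour states leaves the result unchanged, which reflects the statement that only the distribution of neighbour states matters and not which specific neighbour holds which state.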

2.1.3. Information as a measure of dynamical impact 

The instantaneous state of a system consists of $H(S^t)$ bits of 
Shannon information. In other words, the answers to $H(S^t)$ 
unique yes/no questions (bits) must be specified in order to 
determine a unique state $S^t$. As a consequence, the more 
bits about $S^t$ are determined by the instantaneous state $s_i^{t_0}$ 
of a unit $s_i$ at time $t_0 < t$, the more the system state depends 
on the unit's state $s_i^{t_0}$. 

The impact of a unit's state $s_i^{t_0}$ on the system state $S^t$ at a 
particular time $t$ can be measured by their mutual information 
$I(S^t; s_i^{t_0})$. In the extreme case that $s_i^{t_0}$ fully determines the state 
$S^t$, the entropy of the system state coincides with the entropy of 
the unit state, and the dynamical impact is maximum at 
$H(S^t) = H(s_i^{t_0}) = I(S^t; s_i^{t_0})$. In the other extreme case, the unit 
state $s_i^{t_0}$ is completely irrelevant to the system state $S^t$, and the 
information is minimum at $I(S^t; s_i^{t_0}) = 0$. 

The decay of this mutual information over time (as $t \to \infty$) 
is then a measure of the extent that the system's state trajec- 
tory is affected by an instantaneous state of the unit. In 
other words, it measures the 'dynamical importance' of the 
unit. If the mutual information reaches zero quickly, then 
the state of the unit has a short-lasting effect on the collective 
behaviour of the system. The longer it takes for the mutual 
information to reach zero, the more influential is the unit to 
the system's behaviour. We call the time it takes for the 
mutual information to reach zero the IDT of a unit. 
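
The notion of a decay time can be illustrated with a deliberately simple toy case that is not the network model studied here: a single uniformly distributed bit that flips with a fixed probability per time step. The closed-form flip probability after $t$ steps and the threshold `epsilon` (standing in for 'zero') are assumptions of this sketch.

```python
import math

def binary_entropy(f):
    """Entropy in bits of a binary variable with probability f."""
    if f in (0.0, 1.0):
        return 0.0
    return -f * math.log2(f) - (1 - f) * math.log2(1 - f)

def mutual_information_after(t, flip_prob):
    """I(x^0; x^t) in bits for a uniform bit that flips with flip_prob per step."""
    # Composing t binary symmetric channels gives an effective flip probability f_t.
    f_t = 0.5 * (1.0 - (1.0 - 2.0 * flip_prob) ** t)
    return 1.0 - binary_entropy(f_t)

def decay_time(flip_prob, epsilon=1e-3, max_steps=10_000):
    """First time step at which the mutual information drops below epsilon."""
    for t in range(max_steps + 1):
        if mutual_information_after(t, flip_prob) < epsilon:
            return t
    return None  # not reached within max_steps

slow = decay_time(0.1)   # weakly perturbed bit: information persists longer
fast = decay_time(0.3)   # strongly perturbed bit: information is lost sooner
```

The noisier the update, the sooner the initial state is forgotten; the IDT applies the same idea to the information that the whole network retains about one unit.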

2.1.4. Defining the information dissipation time of a unit 

At each time step, the information stored in a unit's state $s^t$ is 
partially transmitted to the next states of its nearest-neighbours 
[42,43], which, in turn, transmit it to their 
nearest-neighbours, and so on. The state of unit $s$ at time $t$ 
dictates the system state at the same time $t$ to the amount of 

    $I_0^k \equiv I(S^t; s^t) = H(s^t),$    (2.2) 

with the understanding that unit $s$ has $k$ interactions. We use 
the notation $I_0^k$ instead of $I_0^s$, because all units that have $k$ 
interactions are indistinguishable in our model. At time $t + 1$, the 
system state is still influenced by the unit's state $s^t$, the 
amount of which is given by 

    $I_1^k \equiv I(S^{t+1}; s^t).$    (2.3) 



As a result, a unit with $k$ connections locally dissipates its information 
at a ratio $I_1^k / I_0^k$ per time step. Here, we use the 
observation that the information about a unit's state $s^t$, which 
is at first present at the unit itself at the maximum amount 
$H(s^t)$, can be only transferred at time $t + 1$ to the direct 
neighbours $h$ of $s$, through nearest-neighbour interactions. 

At subsequent time steps ($t + 2$ and onward), the information 
about the unit with an amount of $I_1^k$ will dissipate 
further into the network at a constant average ratio 

    $\bar{I} = \sum_m q(m) \, \frac{I_1^{m+1}}{I_0^{m+1}}$    (2.4) 



from its neighbours, neighbours-of-neighbours, etc. This is 
due to the absence of degree-degree correlations or other 
structural bias in the network. That is, the distribution $q(m)$ 
of the degrees of a unit's neighbours (and neighbours-of-neighbours) 
does not depend on its own degree $k$. Here, 
$q(m) = (m + 1)\, p(m + 1)\, \langle m \rangle^{-1}$ is the probability distribution 
of the number of additional interactions that a nearest-neighbour 
unit contains besides the interaction with unit $s$, 
or the interaction with a neighbour of unit $s$, etc., called the 
excess degree distribution [44]. As a consequence, the dissemination 
of information of all nodes occurs at an equal 
ratio per time step except for the initial amount of information 
$I_1^k$, which the $k$ neighbour states contain at time 
$t + 1$, which depends on the degree $k$ of the unit. Note that 
this definition of $\bar{I}$ ignores the knowledge that the source 
node has exactly $k$ interactions, which at first glance may 
impact the ability of the neighbours to dissipate information. 
However, this simplification is self-consistent, namely we will 
show that $I_1^k$ diminishes for increasing $k$: this reduces the dissipation 
of information of its direct neighbours, which, in 
turn, reduces $I_1^k$ for increasing $k$, so that our conclusion that 
$I_1^k$ diminishes for increasing $k$ remains valid. See also appendix A 
for a second line of reasoning, about information 
flowing back to the unit $s$. 
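
The excess degree distribution follows directly from the degree distribution, as the sketch below shows. The truncated power law with cut-offs $k = 1, \ldots, 100$ is an illustrative assumption, as are the function names.

```python
def excess_degree_distribution(p):
    """q(m) = (m + 1) * p(m + 1) / <k> for a degree distribution p = {k: p(k)}."""
    mean_degree = sum(k * pk for k, pk in p.items())
    return {k - 1: k * pk / mean_degree for k, pk in p.items() if k >= 1}

def power_law(gamma, k_min, k_max):
    """Truncated power law p(k) proportional to k**(-gamma) on k_min..k_max."""
    weights = {k: k ** (-gamma) for k in range(k_min, k_max + 1)}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}

# The cut-offs below are illustrative assumptions.
p = power_law(gamma=1.6, k_min=1, k_max=100)
q = excess_degree_distribution(p)

mean_degree = sum(k * pk for k, pk in p.items())
# A neighbour reached by following a link has degree m + 1.
mean_neighbour_degree = sum((m + 1) * qm for m, qm in q.items())
```

Because a link is more likely to end at a well-connected unit, the mean degree of a neighbour exceeds the mean degree of a randomly selected unit, independently of the degree of the unit one started from.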

In general, the ratio per time step at which the information 
about $s_i^t$ dissipates from $t + 2$ and onward equals $\bar{I}$ 
up to an 'efficiency factor' that depends on the state-state 
correlations implied by the conditional transition probabilities 
$p(s_k^{t+1} \mid s_j^t)$. For example, if $s_A^t$ dictates 20% of the 
information stored in its neighbour state $s_B^{t+1}$, and $s_B^{t+1}$, in 
turn, dictates 10% of the information in $s_C^{t+2}$, then $I(s_A^t; s_C^{t+2})$ 
may not necessarily equal 20% x 10% = 2% of the information 
$H(s_C^{t+2})$ stored in $s_C^{t+2}$. That is, in one extreme, $s_B^{t+1}$ 
may use different state variables to influence $s_C^{t+2}$ than 
the variables that were influenced by $s_A^t$, in which case 
$I(s_A^t; s_C^{t+2})$ is zero, and the information transmission is inefficient. 
In the other extreme, if $s_B^{t+1}$ uses only state variables 
that were set by $s_A^t$ to influence $s_C^{t+2}$, then passing on A's 
information is optimally efficient and $I(s_A^t; s_C^{t+2}) = 10\%$. 
Therefore, we assume that at every time step from time 
$t + 2$ onward, the ratio of information about a unit that is 
passed on is $c_{\mathrm{eff}} \cdot \bar{I}$, i.e. corrected by a constant factor 
$0 \le c_{\mathrm{eff}} \le 1/\bar{I}$ that depends on the similarity of dynamics of 
the units. It is non-trivial to calculate $c_{\mathrm{eff}}$ but its bounds are 
sufficient for our purposes. 

Next, we can define the IDT of a unit. The number of time 
steps it takes for the information in the network about unit $s$ 
with degree $k$ to reach an arbitrarily small constant $\varepsilon$ is 

    $D(s) = \frac{\log \varepsilon - \log I_1^k}{\log c_{\mathrm{eff}} + \log \bar{I}}.$    (2.5) 



Figure 1. The dynamical impact D(s) of a ferromagnetic Ising spin s as function of its connectivity k, from evaluating the analytical D(s) in equation (2.5) as well as from 
numerical experiments. For the analytical calculations, we used Glauber dynamics to describe the behaviour of the units; for the computer experiments, we used the 
Metropolis-Hastings algorithm. For the latter, we simulate a network of 6000 spins with a power-law degree distribution $p(k) \propto k^{-\gamma}$; the plots are the result of six realizations, 
each of which generated 90 000 time series of unit states that lead up to the same system state, which was chosen randomly after equilibration. The grey area is within two 
times the standard error of the mean IDT of a unit with a given connectivity. (a) T= 2.0, (b) T= 2.5, (c) T= 2.75, (d) T= 9.0, (e) T=M and (f) T= 14. 

Note that D(s) is not equivalent to the classical correlation 
length. The correlation length is a measure of the time it 
takes for a unit to lose a certain fraction of its original 
correlation with the system state, instead of the time it takes 
for the unit to reach a certain absolute value of correlation. 
For our purpose of comparing the dynamical impact of 
units, the correlation length would not be a suitable measure. 
For example, if unit A has a large initial correlation with the 
system state and another unit B has a small initial correlation, 
but the half-life of their correlation is equal, then, in total, we 
consider A to have more impact on the system's state because 
it dictates more bits of information of the system state. 
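
The comparison between units A and B can be made concrete by evaluating equation (2.5) for two units that share the same dissipation ratio but start from different amounts of information. This is a minimal sketch: the numerical values, the default threshold and the parameter names `i1_k`, `i_bar`, `c_eff` and `epsilon` are illustrative assumptions.

```python
import math

def dissipation_time(i1_k, i_bar, c_eff=1.0, epsilon=1e-3):
    """Equation (2.5): D = (log eps - log I_1^k) / (log c_eff + log I_bar)."""
    ratio = c_eff * i_bar
    if not 0.0 < ratio < 1.0:
        raise ValueError("c_eff * i_bar must lie strictly between 0 and 1")
    if i1_k <= 0.0:
        raise ValueError("i1_k must be positive")
    return (math.log(epsilon) - math.log(i1_k)) / math.log(ratio)

# Same dissipation ratio (same 'half-life'), different initial information.
d_unit_a = dissipation_time(i1_k=0.5, i_bar=0.5)    # large initial information
d_unit_b = dissipation_time(i1_k=0.25, i_bar=0.5)   # small initial information
```

With a dissipation ratio of one half, halving the initial information shortens the IDT by exactly one time step, whereas a measure based only on the half-life would not distinguish the two units.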

2.2. Diminishing information dissipation time of hubs 

As a function of the degree $k$ of unit $s$, the unit's IDT satisfies 

    $D(s) \propto \mathrm{const} + \log I_1^k,$    (2.6) 

because $\bar{I}$, $c_{\mathrm{eff}}$ and $\varepsilon$ are independent of the unit's degree. Here, 
the proportionality factor equals $-(\log c_{\mathrm{eff}} + \log \bar{I})^{-1}$, which is 
non-negative, because the dissipation ratio $c_{\mathrm{eff}} \cdot \bar{I}$ is at most 1, 
and the additive constant equals $-\log \varepsilon$, which is positive as 
long as $\varepsilon < 1$. Because the logarithm preserves order, to show 
that the IDT diminishes for high-degree units, it is sufficient 
to show that $I_1^k$ decreases to a constant, as $k \to \infty$, which we 
do next. 

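The range of the quantity $I_1^k$ is 

    $0 \le I_1^k \le \sum_{s_j \in h_i} I(s_j^{t+1}; s_i^t),$    (2.7) 

due to the conditional independence among the neighbour states 
$s_j^{t+1}$ given the node state $s_i^t$. In the average case, the upper bound 
can be written as $k \cdot \langle I(s_j^{t+1}; s_i^t) \rangle_{k_j}$, and we can write $I_1^k$ as 

    $I_1^k = U(k) \cdot k \cdot T(k)$, where $T(k) = \langle I(s_j^{t+1}; s_i^t) \rangle_{k_j},$    (2.8) 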

where $T(k)$ is the information in a neighbour unit's next state 
averaged over its degree, and $U(k)$ is the degree of 'uniqueness' 
of the next states of the neighbours. The operator $\langle \cdot \rangle_{k_j}$ denotes 
an average over the degree $k_j$ of a neighbour unit $s_j$, i.e. weighted 
by the excess degree distribution $q(k_j - 1)$. In one extreme, the 
uniqueness function $U(k)$ equals unity in case the information 
of a neighbour does not overlap with that of any other neighbour 
unit of $s_i$, i.e. the neighbour states do not correlate. It is less than 
unity to the extent that information does overlap between neighbour 
units, but is never negative. See §S3 in the electronic 
supplementary material for a detailed derivation of an exact 
expression and bounds of the uniqueness function $U(k)$. 

Because the factor $U(k) \cdot k$ is at most a linearly growing function 
of $k$, a sufficient condition for $D(s_i)$ to diminish as $k \to \infty$ 
is for $T(k)$ to decrease to zero more strongly than linear in $k$. 
After a few steps of algebra (see appendix), we find that 

    $T(k + 1) = \alpha \cdot T(k)$, where $\alpha \le 1.$    (2.9) 

Here, equality for $\alpha$ only holds in the degenerate case 
where only a single state is accessible to the units. In 
words, we find that the expected value of $T(k)$ converges 
downward to a constant at an exponential rate as $k \to \infty$. 
Because each term is multiplied by a factor $\alpha \le 1$, this convergence 
is downward for most systems but never upward even 
for degenerate system dynamics. 
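
To see why a geometric decay of $T(k)$ is sufficient, the following worked bound may help. It is a sketch under two assumptions that go beyond the text: that $\alpha$ in equation (2.9) is a constant strictly below 1, and that the recursion is anchored at $T(1)$.

```latex
\begin{align*}
T(k) &= \alpha^{k-1}\, T(1), \\
I_1^k &= U(k)\, k\, T(k) \;\le\; k\, \alpha^{k-1}\, T(1)
  \qquad \text{since } 0 \le U(k) \le 1, \\
\frac{\mathrm{d}}{\mathrm{d}k}\bigl(k\, \alpha^{k-1}\bigr)
  &= \alpha^{k-1}\,(1 + k \ln \alpha) \;<\; 0
  \qquad \text{for } k > -\frac{1}{\ln \alpha}.
\end{align*}
```

Under these assumptions, the upper bound on $I_1^k$ rises for small $k$, peaks near $k \approx -1/\ln \alpha$ and then falls towards zero, so that by equation (2.6) the corresponding bound on the IDT is largest for intermediately connected units.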

2.3. Numerical experiments with networks of 
Ising spins 

For our experimental validation, we calculate the IDT D(s) of 
6000 ferromagnetic spins with nearest-neighbour interactions 
in a heavy-tailed network in numerical experiments and find 
that it, indeed, diminishes for highly connected spins. In 
figure 1, we show the numerical results and compare them 
with the analytical results, i.e. evaluating equation (2.5). 

The analytical calculations use the single-site Glauber 
dynamics [45] to describe how each spin updates its state 
depending on the states of its neighbours. In this dynamics, 
at each time step, a single spin chooses its next state accord- 
ing to its stationary distribution of states, which would be 
induced if its nearest-neighbour spin states would be fixed 
to their instantaneous value (LTE). We calculate the upper 
bound of D(s) by setting $U(k) = 1$, that is, all information 
about a unit's state is assumed to be unique, which maximizes 
its IDT. A different constant value for $U(k)$ would merely 
scale the vertical axis. 
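
A single-site Glauber (heat-bath) update can be sketched as follows. The coupling $J = 1$, the toy star network and the function names are illustrative assumptions; the sketch only shows the update rule and does not reproduce the analytical calculation of D(s).

```python
import math
import random

def glauber_up_probability(neighbour_spins, temperature, coupling=1.0):
    """Probability that a spin is +1 after a heat-bath (Glauber) update,
    with its neighbours held fixed at their instantaneous values."""
    local_field = coupling * sum(neighbour_spins)
    return 1.0 / (1.0 + math.exp(-2.0 * local_field / temperature))

def glauber_step(spins, neighbours, temperature, rng):
    """Update one uniformly chosen spin in place; neighbours[i] lists the
    indices of the spins adjacent to spin i."""
    i = rng.randrange(len(spins))
    p_up = glauber_up_probability([spins[j] for j in neighbours[i]], temperature)
    spins[i] = 1 if rng.random() < p_up else -1
    return i

# Hypothetical toy network: a star with hub 0 and three leaves.
neighbours = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
spins = [1, 1, 1, 1]
rng = random.Random(0)
for _ in range(100):
    glauber_step(spins, neighbours, temperature=2.0, rng=rng)
```

Note that the new state is drawn without reference to the spin's own current state, which is what distinguishes this rule from an accept/reject scheme.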
Figure 2. The level of activity of a set of neurons under a microscope as function of time, after seeding one neuron with an electrical potential (black line). The 
activity was measured by changes in calcium ion concentrations. These concentrations were detected by imaging fluorescence levels relative to the average flu- 
orescence of the neurons (activity 0) measured prior to activation. In the sparse cultures with few synapses per neuron, the stimulated neuron evokes a network 
burst of activity in all other neurons in the field after a short delay. By contrast, in the dense cultures with many synapses per neuron, only the stimulated neuron 
has an increased potential. The data for these plots were kindly provided by Ivenshitz & Segal [41]. (a) Low connectivity and (b) high connectivity. 



We perform computer simulations to produce time series 
of the states of 6000 ferromagnetic Ising spins and measure 
the dynamical importance of each unit by regression. For 
each temperature value, we generate six random networks 
with $p(k) \propto k^{-\gamma}$ for $\gamma = 1.6$ and record the state of each spin 
at 90 000 time steps. The state of each unit is updated using 
the Metropolis-Hastings algorithm instead of the Glauber 
update rule to show generality. In the Metropolis-Hastings 
algorithm, a spin will always flip its state if it lowers the interaction 
energy; higher energy states are chosen with a 
probability that decreases exponentially as function of the 
energy increase. From the resulting time series of the unit 
states, we compute the time $d_i$ where $I(s_1^{t+d_i}, \ldots, s_N^{t+d_i}; s_i^t) = \varepsilon$ 
for each unit $s_i$ by regression. This is semantically equivalent 
to $D(s_i)$ but does not assume a locally tree-like structure or a 
uniform information dissipation rate $\bar{I}$. In addition, it ignores 
the problem of correlation (see appendix A). See section S1 
in the electronic supplementary material for methodological 
details; see section S2 in the electronic supplementary material 
for results using higher values of the exponent $\gamma$. The results 
are presented in figure 1. 
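
The Metropolis-Hastings update described above can be sketched for Ising spins as follows. The coupling $J = 1$, the toy star network, the helper names and the choice to also accept flips that leave the energy unchanged are assumptions of this sketch; the simulation details actually used are given in the electronic supplementary material.

```python
import math
import random

def flip_energy_change(spins, neighbours, i, coupling=1.0):
    """Change in interaction energy if spin i were flipped."""
    local_field = coupling * sum(spins[j] for j in neighbours[i])
    return 2.0 * spins[i] * local_field

def metropolis_step(spins, neighbours, temperature, rng):
    """Propose flipping one uniformly chosen spin; return True if accepted."""
    i = rng.randrange(len(spins))
    delta_e = flip_energy_change(spins, neighbours, i)
    # Flips that do not raise the energy are always accepted (assumption:
    # this includes delta_e == 0); others with probability exp(-delta_e / T).
    if delta_e <= 0.0 or rng.random() < math.exp(-delta_e / temperature):
        spins[i] = -spins[i]
        return True
    return False

# Hypothetical toy network: a star with hub 0 and three leaves.
neighbours = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

# At a very low temperature an aligned configuration is effectively frozen.
cold_spins = [1, 1, 1, 1]
rng = random.Random(0)
for _ in range(200):
    metropolis_step(cold_spins, neighbours, temperature=0.01, rng=rng)

# A hub pointing against all of its neighbours lowers the energy by flipping.
hub_gain = flip_energy_change([-1, 1, 1, 1], neighbours, 0)
```

Recording `spins` after every step yields the kind of time series of unit states from which the decay of mutual information can subsequently be estimated.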



2.4. Empirical evidence 

We present empirical measurements from the literature of the 
impact of units on the behaviour of three different systems, 
namely networks of neurons, social networks and protein 
dynamics. These systems are commonly modelled using a 
Gibbs measure to describe the unit dynamics. In each case, 
the highly connected units turn out to have a saturating or 
decreasing impact on the behaviour of the system. This 
provides qualitative evidence that our IDT, indeed, character- 
izes the dynamical importance of a unit, and, consequently, 
that highly connected units have a diminishing dynamical 
importance in a wide variety of complex systems. In each 
study, it remains an open question which mechanism is 
responsible for the observed phenomenon. Our work pro- 
poses a new candidate explanation for the underlying cause 
for each case, namely that it is an inherent property of the 
type of dynamics that govern the units. 

The first evidence is found in the signal processing of 
in vitro networks of neurons [41]. The more densely neurons are 
placed in a specially prepared Petri dish, the more connections 
(synapses) each neuron creates with other neurons. In 
their experiments, Ivenshitz and Segal found that sparsely 



connected neurons are capable of transmitting their electrical 
potential to neighbouring neurons, whereas densely con- 
nected neurons are unable to trigger network activity even 
if they are depolarized in order to discharge several action 
potentials. Their results are summarized in figure 2. In 
search for the underlying cause, the authors exclude some 
obvious candidates, such as the ratio of excitatory versus 
inhibitory connections, the presence of compounds that 
stimulate neuronal excitability and the size of individual 
postsynaptic responses. Although the authors do find tell- 
tale correlations, for example, between the network density 
and the structure of the dendritic trees, they conclude that 
the phenomenon is not yet understood. Note that in this 
experiment, the sparsely connected neuron is embedded in 
a sparsely connected neural network, whereas the densely 
connected neuron is in a dense network. A further validation 
would come from a densely connected neuron embedded in 
a sparse network in order to disentangle the network's 
contribution from the individual effect. 

Second, in a person-to-person recommendation network 
consisting of four million persons, Leskovec et al. [39] found 
that the most active recommenders are not necessarily the 
most successful. In the setting of word-of-mouth market- 
ing among friends in the social networks, the adoption rate 
of recommendations saturates or even diminishes for the 
highly active recommenders, which is shown in figure 3 for 
four product categories. This observation is remarkable, 
because in the dataset, the receiver of a recommendation 
does not know how many other persons receive it as well. 
As a possible explanation, the authors hypothesize that 
widely recommended products may not be suitable for 
viral marketing. Nevertheless, the underlying cause remains 
an open question. We propose an additional hypothesis, 
namely that highly active recommenders have a diminishing 
impact on the opinion forming of others in the social net- 
work. In fact, the model of Ising spins in our numerical 
experiments is a widely used model for opinion forming in 
social networks [14-16,18,20]. As a consequence, the results 
in figure 1 may be interpreted as estimating the dynamical 
impact of a person's opinion as function of the number of 
friends that he debates his opinion with. 

Figure 3. The success of a person's recommendation of a product as function of the number of recommendations that he sent. A person could recommend a product 
to friends only after he purchased the product himself. The success is measured as a normalized rate of receivers buying the product upon the recommendation. The 
normalization counts each product purchase equally in terms of the system's dynamics, as follows: if a person receives multiple recommendations for the same 
product from different senders, a 'successful purchase' is only accounted to one of the senders. The grey area is within 1 s.e.m. The total recommendation network 
consists of four million persons who made 16 million recommendations about half a million products. The subnetworks of the books and DVDs categories are by far 
the largest and most significant, with 73% of the persons recommending books and 52% of the recommendations concerning DVDs. The data for these plots were 
kindly provided by Leskovec et al. [39]. (a) DVD, (b) books, (c) music and (d) video. 

The third empirical evidence is found in the evolutionary 
conservation of human proteins [40]. According to the neutral 
model of molecular evolution, most successful mutations in 
proteins are irrelevant to the functioning of the system of 
protein-protein interactions [46]. This means that the evolutionary 
conservation of a protein is a measure of the intolerance of 
the organism to a mutation to that protein, i.e. it is a measure 
of the dynamical importance of the protein to the reproduci- 
bility of the organism [47]. Brown & Jurisica [40] measured 
the conservation of human proteins by mapping the human 
protein-protein interaction network to that of mice and rats 
using 'orthologues', which is shown in figure 4. Two proteins 
in different species are orthologous if they descend from a 
single protein of the last common ancestor. Their analysis 
reveals that the conservation of highly connected proteins is 
inversely related to their connectivity. Again, this is consistent 
with our analytical prediction. The authors conjecture that 
this effect may be due to the overall high conservation rate, 
approaching the maximum of 1 and therefore affecting the 
statistics. We suggest that it may indeed be an inherent property 
of protein interaction dynamics. 



3. Discussion 

We find that various research areas encounter a diminishing 
dynamical impact of hubs that is unexplained. Our analy- 
sis demonstrates that this phenomenon could be caused 
by the combination of unit dynamics and the topology of 
their interactions. We show that in large Markov networks, 
the dynamical behaviour of highly connected units has a 
low impact on the dynamical behaviour of the system as 
a whole, in the case where units choose their next state 
depending on the interaction potential induced by their 
nearest-neighbours. 




Figure 4. The fraction of evolutionary conservation of human proteins as a 
function of their connectivity k. The fraction of conservation is measured as 
the fraction of proteins that have an orthologous protein in the mouse (circles) 
and the rat (crosses). The dashed and dot-dashed curves show the trend of the 
conservation rates compared with mice and rats, respectively. They are calculated 
using a Gaussian smoothing kernel with a standard deviation of 10 data 
points. To evaluate the significance of the downward trend of both conservation 
rates, we performed a least-squares linear regression of the original data points 
starting from the peaks in the trend lines up to k = 70. For the fraction of 
orthologues with mice, the slope of the regression line is -0.00347 ± 
0.00111 (mean and standard error); with rats, the slope is -0.00937 ± 
0.00594. The vertical bars denote the number of proteins with k interactions 
in the human protein-protein interaction network (logarithmic scale). The 
data for these plots were kindly provided by Brown & Jurisica [40]. 



For highly connected units, this type of dynamics enables 
the LTE assumption, originally used for describing radiative 
transport in a gas or plasma. To illustrate LTE, there is no 



single temperature value that characterizes an entire star: the 
outer shell is cooler than the core. Nonetheless, the mean free 
path of a moving photon inside a star is much smaller than 
the length scale over which the temperature changes, so on a small timescale, the 
photon's movement can be approximated using a local 
temperature value. A similar effect is found in various sys- 
tems of coupled units, such as social networks, gene 
regulatory networks and brain networks. In such systems, 
the internal dynamics of a unit is often faster than a change 
of the local interaction potential, leading to a multi-scale 
description. Intuitive examples are the social interactions in 
blog websites, discussion groups or product recommendation 
services. Here, changes that affect a person are relatively slow 
so that he can assimilate his internal state-of-mind (the unit's 
microstate) to his new local network of friendships and the 
set of personal messages he received, before he makes 
the decision to add a new friend or send a reply (the unit's 
macrostate). Indeed, this intuition combined with our 
analysis is consistent with multiple observations in social 
networks. Watts & Dodds [48] numerically explored the
importance of 'influentials', a minority of individuals who
influence an exceptional number of their peers. They find,
counter to intuition, that large cascades of influence are
usually not driven by influentials, but rather by a critical
mass of easily influenced individuals. Granovetter [49]
found that even though hubs gather information from differ- 
ent parts of the social network and transmit it, the clustering 
and centrality of a node provide better characteristics for dif- 
fusing innovation [50]. Rogers [51] found experimentally that 
the innovator is usually an individual in the periphery of the 
network, with few contacts with other individuals. 

Our approach can be interpreted in the context of how 
dynamical systems intrinsically process information [42,43, 
52-56]. That is, the state of each unit can be viewed as a 
(hidden) storage of information. As one unit interacts with 
another unit, part of its information is transferred to the 
state of the other unit (and vice versa). Over time, the infor- 
mation that was stored in the instantaneous state of one unit 
percolates through the interactions in the system, and at the 
same time it decays owing to thermal noise or randomness. 
The longer this information is retained in the system state, 
the more the unit's state determines the state trajectory of 
the system. This is a measure of the dynamical importance 
of the unit, which we quantify by D(s). 
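
The idea that stored information decays through noise can be illustrated with a deliberately small toy case. The sketch below is not the network model of the main text: it assumes a single isolated two-state unit that flips with a fixed probability per step, and it assumes a half-of-the-initial-information threshold for the dissipation time. The names `retained_information` and `dissipation_time` are hypothetical helpers introduced only for this illustration.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a binary variable with P(1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def retained_information(flip_prob, steps):
    """I(s^0; s^t) in bits for a two-state unit that flips with
    probability `flip_prob` per step, starting from a uniform state.

    After t steps the unit acts as a binary symmetric channel with
    crossover probability (1 - (1 - 2 * flip_prob) ** t) / 2.
    """
    crossover = 0.5 * (1.0 - (1.0 - 2.0 * flip_prob) ** steps)
    return 1.0 - binary_entropy(crossover)


def dissipation_time(flip_prob, fraction=0.5, max_steps=10_000):
    """First step at which less than `fraction` of the initial bit remains.

    The threshold `fraction` is an assumption of this sketch.
    """
    for t in range(1, max_steps + 1):
        if retained_information(flip_prob, t) < fraction:
            return t
    return None  # the information never dropped below the threshold


if __name__ == "__main__":
    for noise in (0.01, 0.05, 0.2):
        print(noise, dissipation_time(noise))
```

In this toy setting a noisier unit loses its initial state information sooner, which is the qualitative behaviour that the dissipation time is meant to capture.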

Our work contributes to the understanding of the behav- 
iour of complex systems at a conceptual level. Our results 
suggest that the concept of information processing can be 
used, as a general framework, to infer how dynamical units 
work together to produce the system's behaviour. The 
inputs to this inference are both the rules of unit dynamics 
as well as the topology of interactions, which contrasts with 
most complex systems research. A popular approach to
infer the importance of units in general is to use topology-only
measures such as connectedness and betweenness-centrality
[28,30,57-62], following the intuition that well-connected or
centrally located units must be important to the behaviour
of the system. We demonstrate that this intuition is not
necessarily true. A more realistic approach is to simulate
a simple process on the topology, such as the
percolation of particles [63], magnetic spin interactions
[3,6,14,20,37,64-72] or the synchronization of oscillators
[37,60,73-80]. The dynamical importance of a unit in
such a model is then translated to that of the complex system
under investigation. Among the 'totalistic' approaches that
consider the dynamics and interaction topology simul- 
taneously, a common method to infer a unit's dynamical 
importance is to perform 'knock-out' experiments [29-31]. 
That is, a unit is experimentally removed or altered, and
the difference in the system's behaviour is observed. This is a
measure of how robust the system is to a perturbation, however,
and care must be taken to translate robustness into
dynamical importance. If the perturbation is not part
of the natural behaviour of the system, then the perturbed
system is not a representative model of the original system.
To illustrate, we find that highly connected ferromagnetic 
spins hardly explain the observed dynamical behaviour of a 
system, even though removing such a spin would have a 
large impact on the average magnetization, stability and 
critical temperature [81,82]. In summary, our work is an 
important step towards a unified framework for understand- 
ing the interplay of the unit dynamics and network topology 
from which the system's behaviour emerges. 

Acknowledgements. We thank Carlos P. Fitzsimons for helping us find 
and interpret empirical evidence from the field of neurobiology. 
We also thank Gregor Chliamovitch and Omri Har-Shemesh for 
their feedback on the mathematical derivations. 
Funding statement. We acknowledge the financial support of the Future 
and Emerging Technologies (FET) programme within the Seventh 
Framework Programme (FP7) for Research of the European Commis- 
sion, under the FET-Proactive grant agreement TOPDRIM, number 
FP7-ICT-318121, as well as under the FET-Proactive grant agreement 
Sophocles, number FP7-ICT-317534. P.M.A.S. acknowledges the NTU
Complexity Programme in Singapore and the Leading Scientist
Programme of the Government of the Russian Federation, under
contract no. 11.G34.31.0019.



Appendix A 

A.1. Limiting behaviour of $p(s_i^t)$ as $k \to \infty$

Using equation (2.1), the prior probability of a unit's state can
be written as

\[ p(s_i^{t+1} = q) = \sum_{r} p(h_i^{t} = r) \cdot e^{-\sum_{j=1}^{k} \epsilon(q, r_j)/T} \cdot Z_k^{-1}, \tag{A 1} \]

where $Z_k$ is the partition function for a unit with k edges. As
$k \to \infty$, the set of interaction energies starts to follow a
stationary distribution of nearest-neighbour states, and the
expression can be approximated as



\[ p_k(s_i^{t} = q) = e^{-k \langle \epsilon_q \rangle / T} \cdot Z_k^{-1}. \tag{A 2} \]



Here, $\langle \epsilon_q \rangle$ is the expected interaction energy of the state q with
one neighbour, averaged over the neighbours' state distribution.
If an edge is added to such a unit, the expression
becomes (the subscript k + 1 denotes the degree of the node
as a reminder)

\[ p_{k+1}(s_i^{t} = q) = e^{-(k+1) \langle \epsilon_q \rangle / T} \cdot Z_{k+1}^{-1}. \tag{A 3} \]

In words, the energy term for each state q is multiplied by a
factor $e^{-\langle \epsilon_q \rangle / T}$ that depends on the state but is constant with
respect to k. (The partition function changes with k to suitably
normalize the new terms, but it does not depend on q and so
is not responsible for moving probability mass.) That is, as k
grows, the probability of the state q with the lowest expected
interaction energy approaches unity; the probabilities of all
other states will approach zero. The approaches are exponential,
because the multiplying factors do not depend on k.
If there are m states with the lowest interaction energies 
(multiplicity m), then each probability of these states will 
approach 1/m. 
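
The limiting behaviour described in this section can be checked numerically with a minimal sketch. It assumes that the prior of a degree-k unit is proportional to exp(-k * mean_energy / T) for each state, as the approximation above suggests; the function name `prior_for_degree` and the example energies are hypothetical and chosen only for illustration.

```python
import math


def prior_for_degree(mean_energies, k, temperature=1.0):
    """Prior over states for a unit of degree k.

    Each state q gets weight exp(-k * mean_energies[q] / temperature),
    and the weights are normalised by their sum (the partition function).
    """
    lowest = min(mean_energies)
    # Shifting by the lowest energy leaves the normalised result
    # unchanged and avoids overflow for large k.
    weights = [math.exp(-k * (e - lowest) / temperature) for e in mean_energies]
    z = sum(weights)
    return [w / z for w in weights]


if __name__ == "__main__":
    for k in (1, 5, 20, 50):
        # One lowest-energy state: its probability tends to 1.
        print(k, prior_for_degree([0.0, 0.5, 1.0], k))
    # Two lowest-energy states (multiplicity 2): each tends to 1/2.
    print(prior_for_degree([0.0, 0.0, 1.0], 50))
```

Adding one edge multiplies the odds of any state relative to the lowest-energy state by the same k-independent factor, which is why the convergence is exponential in k.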



A.2. Deriving an upper bound on $\alpha$ in $T(k+1) = \alpha \cdot T(k)$

First, we write T(k) as an expected mutual information between
the state of a unit and the next state of its neighbour, where the
average is taken over the degree of the neighbour unit:

\[ T(k) = \bigl\langle H(s_i^{t}) - H(s_i^{t} \mid s_j^{t+1}) \bigr\rangle_{k_j}. \tag{A 4} \]

We will now study how T(k) behaves as k grows for large
k. By definition, both entropy terms are non-negative, and
$H(s_i^{t} \mid s_j^{t+1}) \le H(s_i^{t})$. In §A.1 of this appendix, we find that the
prior probabilities of the state of a high-degree unit exponentially
approach either zero from above or a constant from
below. In the following, we assume that this constant is
unity for the sake of simplicity, i.e. that there is only one
state with the lowest possible interaction energy.

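\[
\begin{aligned}
H(s_i^{t}) &= -\sum_{q \in X_1} (1 - x_q^{-k}) \log(1 - x_q^{-k}) - \sum_{q \in X_0} x_q^{-k} \log x_q^{-k} \\
&= -\sum_{q \in X_1} (1 - x_q^{-k}) \log(1 - x_q^{-k}) + k \sum_{q \in X_0} x_q^{-k} \log x_q \\
&= O(k \cdot x^{-k}).
\end{aligned}
\tag{A 5}
\]

Here, $X_1$ is the set of states whose probability approaches unity from
below, $X_0$ is the set of states whose probability approaches zero from
above, and $x_q > 1$ sets the exponential rate for state q.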

In words, the first entropy term eventually goes to zero
exponentially as a function of the degree of a unit. Because
this entropy term is the upper bound on the function T(k),
there are three possibilities for the behaviour of T(k). The
first option is that T(k) is zero for all k, which is a degenerate
system without dynamical behaviour. The second option is
that T(k) is a monotonically decreasing function of k, and
the third option is that T(k) first increases and then decreases
as a function of k. In the latter two cases, for large k the function T(k)
must approach zero exponentially.

In summary, we find that for large k

\[ T(k+1) = \alpha \cdot T(k), \quad \text{where } \alpha < 1. \tag{A 6} \]
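
A quick numerical check of the bounding argument is sketched below. It only examines the entropy of the prior, which is the upper bound on T(k), and not T(k) itself. It assumes the same form of prior as in §A.1 (weights proportional to exp(-k * mean_energy / T)); the helper names and the example energies are hypothetical.

```python
import math


def gibbs_entropy(mean_energies, k, temperature=1.0):
    """Entropy (in nats) of the prior of a degree-k unit, assuming
    weights proportional to exp(-k * mean_energy / temperature)."""
    lowest = min(mean_energies)
    weights = [math.exp(-k * (e - lowest) / temperature) for e in mean_energies]
    z = sum(weights)
    probs = [w / z for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def successive_ratios(mean_energies, degrees, temperature=1.0):
    """Ratio H(k + 1) / H(k) for each k in `degrees`."""
    return [
        gibbs_entropy(mean_energies, k + 1, temperature)
        / gibbs_entropy(mean_energies, k, temperature)
        for k in degrees
    ]


if __name__ == "__main__":
    # Single lowest-energy state: the ratios stay below one.
    for k, ratio in zip(range(5, 25), successive_ratios([0.0, 1.0], range(5, 25))):
        print(k, round(ratio, 4))
```

With a single lowest-energy state the ratio of successive entropies stays below one for the degrees tested, so the bound shrinks by a roughly constant factor per added edge.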

The assumption of multiplicity unity of the lowest interaction
energy is not essential. If this assumption is relaxed,
then in step 3 of equation (A 5) the first term does not
become zero but a positive constant. It may be possible that
a system where T(k) equals this constant across k is not degenerate,
in contrast to the case of multiplicity unity, so in this
case, we must relax the condition in equation (A 6) to include
the possibility that all units are equally important, i.e. $\alpha \le 1$.
This still makes it impossible for the impact of a unit to keep
increasing as its degree grows.



A.3. Information flowing back to a high-degree unit

In the main text, we simplify the information flow through the
network by assuming that the amount of information $I_1^k$
stored in the neighbours of a unit flows onward into the network,
and does not flow back to the unit. Here, we rationalize
that this assumption is appropriate for high-degree units.

Suppose that at time t + 1, the neighbour unit $s_j$ stores
$I(s_i^{t}; s_j^{t+1})$ bits of information about the state $s_i^{t}$. At time
t + 2, part of this information will be stored by two variables:
the unit's own state $s_i^{t+2}$ and the combined variable of
neighbour-of-neighbour states $\{s_{j_1}^{t+2}, \ldots, s_{j_{k_j}}^{t+2}\}$. In order for the IDT
$D(s_i)$ of unit $s_i$ to be affected by the information that flows
back, this information must add a (significant) amount to
the total information at time t + 2. We argue however that
this amount is insignificant, i.e.

\[
I(s_i^{t}; S^{t+2}) - I(s_i^{t}; \{s_{j_1}^{t+2}, \ldots, s_{j_{k_j}}^{t+2}\})
= I(s_i^{t}; s_i^{t+2} \mid \{s_{j_1}^{t+2}, \ldots, s_{j_{k_j}}^{t+2}\}) \approx 0. \tag{A 7}
\]

The term $I(s_i^{t}; s_i^{t+2} \mid \{s_{j_1}^{t+2}, \ldots, s_{j_{k_j}}^{t+2}\})$ is the conditional mutual
information. Intuitively, it is the information that $s_i^{t+2}$ stores
about $s_i^{t}$ which is not already present in the states
$\{s_{j_1}^{t+2}, \ldots, s_{j_{k_j}}^{t+2}\}$.

The maximum amount of information that a variable can
store about other variables is its entropy, by definition. It follows
from sections A.1 and A.2 of this appendix that the entropy
of a high-degree unit is lower than the average entropy of a
unit. In fact, in the case of multiplicity unity of the lowest interaction
energy, the capacity of a unit goes to zero as $k \to \infty$. For
this case, this proves that $I(s_i^{t}; s_i^{t+2} \mid \{s_{j_1}^{t+2}, \ldots, s_{j_{k_j}}^{t+2}\})$ indeed goes
to zero. For higher multiplicities, we observe that the entropy
$H(s_i^{t+2})$ is still (much) smaller than the total entropy of the
neighbours of a neighbour, $H(s_{j_1}^{t+2}) + H(s_{j_2}^{t+2} \mid s_{j_1}^{t+2}) + \cdots$. Therefore,
the information $I(s_i^{t}; s_i^{t+2})$ that flows back is (much)
smaller than $I(s_i^{t}; \{s_{j_1}^{t+2}, \ldots, s_{j_{k_j}}^{t+2}\})$, and the conditional variant
is presumably smaller still. Therefore, we assume that also in
this case, the information that flows back has an insignificant
effect on $D(s_i)$.
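
The capacity argument of this section can be spelled out as a short chain of standard inequalities. This is a sketch in shorthand: the symbol N is introduced here only for brevity and stands for the combined neighbour-of-neighbour states at time t + 2.

```latex
\begin{align*}
I\bigl(s_i^{t};\, s_i^{t+2} \mid N\bigr)
  &= H\bigl(s_i^{t+2} \mid N\bigr) - H\bigl(s_i^{t+2} \mid s_i^{t}, N\bigr) \\
  &\le H\bigl(s_i^{t+2} \mid N\bigr) \\
  &\le H\bigl(s_i^{t+2}\bigr).
\end{align*}
```

The first inequality holds because conditional entropy is non-negative, and the second because conditioning cannot increase entropy. Whenever the entropy of the high-degree unit vanishes, the conditional term therefore vanishes with it.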

A.4. A note on causation versus correlation 

In the general case, the mutual information $I(s_y^{t}; s_x^{t_0})$ between the
state of unit $s_x$ at time $t_0$ and another unit's state $s_y$ at time t is the
sum of two parts: $I_{\text{causal}}$, which is information that is due to a
causal relation between the state variables, and $I_{\text{corr}}$, which is
information due to 'correlation' that does not overlap with the
causal information. Correlation occurs if the units $s_x$ and $s_y$
both causally depend on a third 'external' variable e in a similar
manner, i.e. such that $I(e; \{s_y^{t}, s_x^{t_0}\}) < I(e; s_y^{t}) + I(e; s_x^{t_0})$. This can
lead to a non-zero mutual information $I(s_y^{t}; s_x^{t_0})$ among the two
units, even if the two units would not directly depend on each
other in a causal manner [83,84].
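
A minimal sketch of such non-causal information is given below. It assumes two binary units that each copy a fair external bit e, each flipping the copy independently with probability `noise`, and that never interact with each other. The function names are hypothetical and the setup is a toy example, not the network model of the main text.

```python
import itertools
import math


def mutual_information(joint):
    """I(X; Y) in bits from a dict {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0.0
    )


def common_cause_joint(noise):
    """Joint distribution of two units that both depend on an external
    fair bit e but have no direct interaction with each other."""
    joint = {}
    for e, x, y in itertools.product((0, 1), repeat=3):
        p = 0.5  # probability of the external bit e
        p *= (1.0 - noise) if x == e else noise
        p *= (1.0 - noise) if y == e else noise
        joint[(x, y)] = joint.get((x, y), 0.0) + p  # marginalise over e
    return joint


if __name__ == "__main__":
    for noise in (0.0, 0.1, 0.3, 0.5):
        print(noise, mutual_information(common_cause_joint(noise)))
```

The two units share information whenever both track the external variable, and the shared information disappears once the copies become pure noise.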

For this reason, we do not directly calculate the dependence
of $I(S^{t}; s^{t_0})$ on the time variable t in order to calculate
the IDT of a unit s. It would be difficult to tell how much
of this information is non-causal at every time point. In 
order to find this out, we would have to understand exactly 
how each bit of information is passed onward through the 
system, from one state variable to the next, which we do 
not yet understand at this time. 



To prevent measuring the non-causal information present
in the network, we use local single-step 'kernels' of information
diffusion, namely the $I_1^k / I_0$ as discussed previously.
The information $I_0$ is trivially of causal nature (i.e. non-causal
information is zero), because it is fully stored in the
state of the unit itself. Although, in the general case, $I_1^k$ may
consist of a significant non-causal part, in our model, we
assume this to be zero or at most an insignificant amount.
The rationale is that units do not self-interact (no self-loops),
and the network is locally tree-like: if $s_x$ and $s_y$ are
direct neighbours, then there is no third unit $s_z$ with 'short'
interaction pathways to both $s_x$ and $s_y$. The only way that
non-causal (i.e. not due to $s_x^{t}$ influencing $s_y^{t+1}$) information can
be created between $s_x^{t}$ and $s_y^{t+1}$ is through the pair of
interaction paths $s_z^{t'} \to \cdots \to s_y^{t-1} \to s_x^{t}$ and $s_z^{t'} \to \cdots \to s_y^{t+1}$,
where $t' < t - 1$. That is, one and the same state variable $s_z^{t'}$
must causally influence both $s_x^{t}$ and $s_y^{t+1}$, where it can reach
$s_x$ only through $s_y$. We expect that any thusly induced
non-causal information in $I(s_y^{t+1}; s_x^{t})$ is insignificant compared
with the causal information through $s_x^{t} \to s_y^{t+1}$, and the
reason is threefold. First, the minimum lengths of the two
interaction paths from $s_z^{t'}$ are two and three interactions,
respectively, where information is lost through each interaction
due to its stochastic nature. Second, of the
information that remains, not all information $I(s_z^{t'}; s_x^{t})$ may
overlap with $I(s_z^{t'}; s_y^{t+1})$, but even if it does, then the
'correlation part' of the mutual information $I(s_y^{t+1}; s_x^{t})$ due to this
overlap is upper bounded by their minimum:
$\min\{I(s_z^{t'}; s_x^{t}), I(s_z^{t'}; s_y^{t+1})\}$. Third, the mutual information due
to correlation may, in general, overlap with the causal information,
i.e. both pieces of information may be partly about
the same state variables. That is, the $I_{\text{corr}}$ part of $I(s_y^{t+1}; s_x^{t})$,
which is the error of our assumption, is only that part of the
information-due-to-correlation that is not explained by
(contained in) $I_{\text{causal}}$. The final step is the observation that $I_1^k$ is the
combination of all $I(s_y^{t+1}; s_x^{t})$ for all neighbour units $s_y \in h_x$.